Report versioning errors in StorageCluster's status #1447
I understand why we need to do this, but an operator updating its own CR is probably not a good idea. We need to rework the whole version thing. So, putting this on hold while we explore other options.
/hold
It seems late initialization is fine. Can we patch only the version field from inside version check, instead of updating the entire resource? Changes in this PR can be reverted.
I would want to see a selective patch rather than a full update. We run the risk of overriding a change that came in right after the current reconciliation event, and we don't want to touch any other spec.
Force-pushed from ef8fc19 to 55b704d.
@rexagod @jarrpa @umangachapagain I would vote to stay with update and maybe add retries to the status update.
/hold I think I missed Ohad's comment before pushing the latest commit. Nonetheless, I'll put this PR on hold until we reach a consensus here.
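To make the "add retries to the status update" suggestion concrete, here is a minimal, standard-library-only sketch of retrying a write on conflict. Everything in it (`object`, `store`, `updateStatusWithRetry`, the phase strings) is hypothetical and only mimics optimistic concurrency with an integer resource version; it is not the operator's code. In a real controller, `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` serves the same purpose.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict mimics the conflict an API server returns for a stale write.
var errConflict = errors.New("conflict: object was modified")

// object is a hypothetical, heavily reduced stand-in for a StorageCluster.
type object struct {
	ResourceVersion int
	Phase           string // stands in for the status in this sketch
}

// store is a toy in-memory API server with optimistic concurrency.
type store struct{ current object }

func (s *store) get() object { return s.current }

// updateStatus accepts the write only if it was based on the latest version.
func (s *store) updateStatus(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// updateStatusWithRetry re-reads the object and reapplies mutate after each
// conflict, giving up after the given number of attempts.
func updateStatusWithRetry(s *store, attempts int, mutate func(*object)) error {
	var err error
	for i := 0; i < attempts; i++ {
		o := s.get()
		mutate(&o)
		if err = s.updateStatus(o); !errors.Is(err, errConflict) {
			return err // success or a non-conflict error
		}
	}
	return err
}

func main() {
	s := &store{current: object{ResourceVersion: 1, Phase: "Progressing"}}
	raced := false
	err := updateStatusWithRetry(s, 3, func(o *object) {
		if !raced {
			// Simulate another writer landing between our read and our write.
			s.current.ResourceVersion++
			raced = true
		}
		o.Phase = "Error"
	})
	fmt.Println(err, s.current.Phase, s.current.ResourceVersion)
}
```

The first attempt fails because the stored object moved on after it was read; the second attempt starts from a fresh read and succeeds.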
It took some doing but I think I understand the problem with the Patch() strategy now.
In general a Patch() strategy has its place, and it is not here. Specifically this is because the StorageCluster reconcile loop is designed to collect changes to the Status and push them all at once at the end of the reconcile loop (whether or not reconciliation completed successfully is a different story). As such, what Ohad is saying is correct: we shouldn't be "merging" new state information, we should be "replacing" the whole thing, since we are taking on the role of the source of truth on the current state of the system at that point in time.
Now, it is true that we have a bug in how we deal with the Version field in the spec. But also, the way we're using the version field is completely broken: we should never be changing the spec to begin with. I would rather see the version field be dropped entirely and have the "version" information moved to the Status subresource. That will require a bit of design discussion, but shouldn't be too bad.
All that being said, there is still some worthwhile work being done in this PR:
- I actually like the idea of having the version string be empty in version.go, so that should be kept.
- The versionCheck error condition is also a great addition, so keep that as well.
If you want to change this PR to just those two changes, I'm all right with that. But I would rather have a better conversation on whether we can safely stop using the spec.version field once and for all. We'll have to coordinate with the UI team to know when they'd be able to update the Console code for that, but as it stands it's already broken so what more harm could it do? ¯_(ツ)_/¯
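The "replace, don't merge" argument in the comment above can be sketched roughly as follows. This is a standard-library-only illustration, not the operator's reconcile loop: `Status`, `step`, `record`, `failing`, `reconcile`, `push`, and the phase and condition strings are all hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a hypothetical, reduced stand-in for StorageCluster's status.
type Status struct {
	Phase      string
	Conditions []string
}

// step is one unit of reconcile work that may record state or fail.
type step func(*Status) error

// record returns a step that notes a condition name and succeeds.
func record(name string) step {
	return func(st *Status) error {
		st.Conditions = append(st.Conditions, name)
		return nil
	}
}

// failing returns a step that always fails; it exists only for the demo.
func failing(msg string) step {
	return func(*Status) error { return errors.New(msg) }
}

// reconcile builds a fresh Status, stops at the first failing step, and
// returns the status either way so the caller can push it in one write.
func reconcile(steps []step) (Status, error) {
	st := Status{Phase: "Progressing"}
	for _, run := range steps {
		if err := run(&st); err != nil {
			st.Phase = "Error"
			return st, err
		}
	}
	st.Phase = "Ready"
	return st, nil
}

// push replaces the stored status wholesale instead of merging into it.
func push(stored *Status, fresh Status) {
	*stored = fresh
}

func main() {
	stored := Status{Phase: "Ready", Conditions: []string{"StaleCondition"}}
	fresh, err := reconcile([]step{record("VersionChecked"), failing("ceph cluster not ready")})
	push(&stored, fresh) // pushed even though reconciliation failed
	fmt.Println(stored.Phase, stored.Conditions, err)
}
```

Because the stored status is replaced rather than merged, the stale condition from the previous pass does not survive; only what this pass observed is reported.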
We use version info to generate a version metric. As long as this doesn't break, I'm fine with it.
ocs-operator/metrics/internal/version/version.go, lines 1 to 11 in 4f2dfef
I think we'll need to include the version string since
Bugzilla link: https://bugzilla.redhat.com/show_bug.cgi?id=1855339
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: rexagod, umangachapagain
New changes are detected. LGTM label has been removed.
/retest
Will
/retest
Update StorageCluster status on versioning error. Append an error status condition to StorageCluster status field when an error arises during `versionCheck`. Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
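As a rough illustration of what this commit message describes, the sketch below records an error condition in the status when a version check fails. It uses only the standard library, and every name in it (`Condition`, `setCondition`, `reconcileVersion`, the `VersionMismatch` type, the `VersionCheckFailed` reason) is hypothetical. The rule inside `versionCheck` here, rejecting a recorded version newer than the operator's, is an assumption made for the example and may not match the operator's actual check. The sketch also replaces an existing condition of the same type rather than strictly appending, so repeated reconciles do not pile up duplicates.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Condition is a hypothetical, reduced status condition.
type Condition struct {
	Type, Status, Reason, Message string
}

// setCondition replaces the condition with the same Type, or appends it.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// parse turns "major.minor.patch" into three integers.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("malformed version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("malformed version %q", v)
		}
		out[i] = n
	}
	return out, nil
}

// versionCheck fails when the recorded version is newer than the operator's
// (an assumed rule for this sketch).
func versionCheck(recorded, operator string) error {
	r, err := parse(recorded)
	if err != nil {
		return err
	}
	o, err := parse(operator)
	if err != nil {
		return err
	}
	for i := range r {
		if r[i] > o[i] {
			return fmt.Errorf("recorded version %s is newer than operator version %s", recorded, operator)
		}
		if r[i] < o[i] {
			return nil
		}
	}
	return nil
}

// reconcileVersion surfaces a versionCheck failure as a status condition.
func reconcileVersion(conds []Condition, recorded, operator string) ([]Condition, error) {
	if err := versionCheck(recorded, operator); err != nil {
		return setCondition(conds, Condition{
			Type:    "VersionMismatch",
			Status:  "True",
			Reason:  "VersionCheckFailed",
			Message: err.Error(),
		}), err
	}
	return conds, nil
}

func main() {
	conds, err := reconcileVersion(nil, "4.8.0", "4.7.2")
	fmt.Println(err)
	fmt.Println(len(conds), conds[0].Type, conds[0].Reason)
}
```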
@rexagod: The following tests failed, say
Follow-up PR: #1706 (
@rexagod: PR needs rebase.
@rexagod I think this PR is stale now and no longer required. Can we close this?
Looking at the latest bz status, I think we still need this PR.
Not sure if we still need this, but since I've moved to a different team recently, I'd appreciate it if someone with bandwidth could pick this up. Closing, TIA.
Earlier, only the status of the storage cluster was being updated,
which let side effects such as a version mismatch come into play.